Method and related apparatus for motion estimation

ABSTRACT

Disclosed is a motion estimation method for selecting a target motion vector from a plurality of candidate motion vectors in a search range of a target picture for an encoding block having a plurality of pixels in a picture. The method comprises: utilizing a distortion function to calculate the difference between the pixels of the encoding block and the pixels in the search range of the target picture to generate at least one distortion value; utilizing an entropy function to determine the distribution of the difference between the pixels of the encoding block and the pixels of blocks in the search range corresponding to candidate motion vectors to generate at least one distribution value; summing up the distortion value and the distribution value to generate at least one sum value; and selecting the target motion vector according to the sum value.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of applicant's earlier application, Ser. No. 10/904,421, filed Nov. 9, 2004, which is incorporated herein by reference.

BACKGROUND OF INVENTION

1. Field of the Invention

The invention relates to a method and related apparatus for motion estimation in a video compression system, and more particularly, to a method and related apparatus for motion estimation using a cost function.

2. Description of the Prior Art

As multimedia technology develops, more and more standards related to video compression have been introduced. For instance, various versions of MPEG are standards for digital video compression, and ITU H.261, H.263, ISO 10918 are other examples.

MPEG defines a standard for digital video compression. A motion picture is composed of a series of pictures, and each picture can be regarded as a two-dimensional array composed of a plurality of pixels, which is called a frame of the motion picture. The MPEG standard defines four types of pictures: I picture, which is encoded without referring to any other pictures; P picture, which is encoded through motion estimation referring to a previous I picture or P picture; B picture, which is encoded through motion estimation referring to a following I picture or P picture; and D picture, which is used in fast forward search mode.

Video compression systems complying with the standards mentioned above utilize motion estimation technology based on blocks or macroblocks in order to reduce the temporal redundancy. During motion estimation, for a current encoding block in a current picture, the video compression system will find a best matching block, which is the most similar to the current encoding block, from a target picture. In this case, for the current encoding block, the video compression system can store (or transmit) the motion vector and the residual calculated to represent data included in the current encoding block (wherein the residual represents a pixel value difference between the current encoding block and the best matching block).

According to the prior art, when the video compression system searches for the best matching block from a search range, a cost function called “sum of absolute difference” is used, which is obtained as follows: ${SAD}(x,y) = \sum\limits_{i = i0}^{i1}{\sum\limits_{j = j0}^{j1}{\left| C_{i,j} - P_{i + x,j + y} \right|}}$

wherein (x, y) is a candidate motion vector in the search range, (i1-i0)*(j1-j0) is the size of the current encoding block, C_(i,j) is a pixel in the current encoding block, and P_(i+x,j+y) is a pixel in the search range of the target picture.
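As a non-limiting illustration, the sum of absolute difference defined above may be sketched in Python as follows. The function name `sad`, the nested-list picture layout indexed as `picture[i][j]`, the convention that (i0, j0) is the position of the block's first pixel, and the sample pixel values are all assumptions made for this sketch and are not taken from the original disclosure.

```python
def sad(block, target, i0, j0, x, y):
    """Sum of absolute difference for the candidate motion vector (x, y).

    block  : pixels C of the current encoding block (list of rows)
    target : pixels P of the target picture (list of rows)
    i0, j0 : assumed position of the block's first pixel in the current picture
    """
    total = 0
    for di, row in enumerate(block):
        for dj, c in enumerate(row):
            p = target[i0 + di + x][j0 + dj + y]   # P_(i+x, j+y)
            total += abs(c - p)                    # |C_(i,j) - P_(i+x,j+y)|
    return total


# Hypothetical 3x4 target picture and 2x2 encoding block located at (0, 0).
target = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
block = [[6, 7], [10, 12]]

print(sad(block, target, 0, 0, 0, 0))  # candidate (0, 0)
print(sad(block, target, 0, 0, 1, 1))  # candidate (1, 1)
```

In this made-up data the candidate (1, 1) gives the smaller value, so a search based on the sum of absolute difference alone would prefer it.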

The conventional video compression system finds a candidate motion vector (x, y), which minimizes the cost function, to be the optimal motion vector (x1, y1) of the current encoding block. Such a method finds the best matching block having the smallest residual so that the residual can be better compressed. However, the found optimal motion vector (x1, y1) may not result in better compression; thus U.S. Pat. No. 5,847,776 discloses another cost function that considers not only the sum of absolute difference but also the volume of the motion vector during the search for the optimal motion vector, so that a balance can be kept between the found optimal motion vector and the residual corresponding to it.

However, most video compression systems utilize a discrete cosine transform (DCT) algorithm to transform the residual from the spatial domain into the frequency domain when compressing the residual. Then the video compression system utilizes a corresponding quantization matrix and a quantization step Qp, which changes according to a bit rate selected by the system, to quantize the residual in the frequency domain. Since the quantized matrix is a two-dimensional matrix, the system further utilizes zig-zag scan or alternate scan to scan the quantized two-dimensional data into one-dimensional data. Finally, the video compression system performs variable length coding.
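As a non-limiting illustration of the scanning step only, a zig-zag scan of a square matrix may be sketched in Python as follows. The function name `zigzag`, the scan order (starting at the top-left coefficient and moving right first, as is commonly done for DCT blocks), and the sample coefficient values are assumptions of this sketch; the DCT and the quantization themselves are not modelled.

```python
def zigzag(matrix):
    """Scan a square two-dimensional matrix into a one-dimensional list
    along its anti-diagonals, alternating direction on each diagonal."""
    n = len(matrix)
    out = []
    for s in range(2 * n - 1):                 # s = i + j is constant on a diagonal
        idx = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            idx.reverse()                      # even diagonals run bottom-left to top-right
        out.extend(matrix[i][j] for i, j in idx)
    return out


# Hypothetical quantized coefficients: only low frequencies are non-zero.
quantized = [
    [9, 3, 0],
    [2, 0, 0],
    [0, 0, 0],
]
print(zigzag(quantized))
```

When the non-zero coefficients are concentrated at low frequencies, the scanned sequence ends in a long run of zeros, which is the situation in which variable length coding produces short codes.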

During variable length coding, the smaller the frequency distribution range of the residual in the frequency domain is, the shorter the code length of the encoded residual is (i.e. the better compressed the residual is). In other words, a smaller frequency distribution range of the residual means that the degree of disorder or randomness of the residual is smaller, i.e. the entropy (a measurement or function of disorder or randomness) of the residual is smaller. However, neither the prior art nor the method disclosed in U.S. Pat. No. 5,847,776 can find the best matching block whose residual has the smallest frequency distribution range or the smallest entropy. Even in the case that the found best matching block results in a residual in the spatial domain having the smallest sum of absolute difference, after it has been processed by DCT, quantization, zig-zag scan (or other scan methods), and variable length coding, the residual may not necessarily have the shortest code length, meaning that the optimal compression cannot be achieved. This is a main problem in the prior art.

Recently, a new MPEG technique, H.264, has been disclosed, which calculates a bit rate for each candidate motion vector to determine an optimal motion vector. However, such a method causes a heavy calculation load and is time consuming.

SUMMARY OF INVENTION

It is therefore a primary objective of the claimed invention to provide a method and related apparatus utilizing an entropy function and a distortion function to solve the problems mentioned above.

The disclosed embodiment of the present invention provides a motion estimation method for selecting a target motion vector from a plurality of candidate motion vectors in a search range of a target picture for an encoding block having a plurality of pixels in a picture. The method comprises: utilizing a distortion function to calculate the difference between the pixels of the encoding block and the pixels in the search range of the target picture to generate at least one distortion value; utilizing an entropy function to determine the distribution or variation of the difference between the pixels of the encoding block and the pixels of blocks in the search range corresponding to candidate motion vectors to generate at least one distribution value; summing up the distortion value and the distribution value to generate at least one sum value; and selecting the target motion vector according to the sum value.

The disclosed embodiment also provides a motion estimation device for selecting a target motion vector from a plurality of candidate motion vectors in a search range for an encoding block having a plurality of pixels. The device comprises: a distortion calculator for utilizing a distortion function to calculate the difference between the pixels of the encoding block and the pixels in the search range of a target picture to generate at least one distortion value; a spatial variation calculating module for utilizing an entropy function to determine the distribution or variation of the difference between the pixels of the encoding block and the pixels in the search range corresponding to candidate motion vectors to generate at least one distribution value; and a motion vector determining module, coupled with the distortion calculator and the spatial variation calculating module, for summing up the distortion value and the distribution value to generate at least one sum value and for determining the target motion vector according to the sum value.

These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 illustrates a flowchart of a method according to the present invention.

FIG. 2 illustrates a motion estimation device in the video compression system according to the present invention.

DETAILED DESCRIPTION

Please refer to FIG. 1 showing a flowchart of a method according to the present invention. The method can be used in a video compression system for motion estimation. The video compression system divides a current picture into a plurality of blocks. The method is shown in FIG. 1 as follows:

Step 110: Step through a plurality of candidate motion vectors (x, y) in a search range for a current encoding block of a current picture. The current encoding block includes (i1-i0)*(j1-j0) pixels.

Step 120: Calculate a cost function for each candidate motion vector (x, y) as follows: ${CF}(x,y) = \sum\limits_{i = i0}^{i1}{\sum\limits_{j = j0}^{j1}\left\{ {\left| C_{i,j} - P_{i + x,j + y} \right| + h\left( Qp,C_{i,j},P_{i + x,j + y} \right)} \right\}}$

wherein C_(i,j) is a pixel in the current encoding block, P_(i+x,j+y) is a pixel in the search range of the target picture, and Qp is a quantization step. As is well known by persons skilled in the art, $\sum\limits_{i = i0}^{i1}{\sum\limits_{j = j0}^{j1}{\left| C_{i,j} - P_{i + x,j + y} \right|}}$ is a distortion function for computing the difference between the pixels of the encoding block and the pixels in the search range of the target picture.

Step 130: Determine the candidate motion vector (x, y) in the search range that meets a desired condition, such as minimizing the cost function, to be a target motion vector (x1, y1) of the current encoding block. The target motion vector which minimizes the cost function may be regarded as an optimal motion vector.
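Steps 110 to 130 may be sketched, as a non-limiting illustration, by the following Python code. The names `abs_diffs`, `select_motion_vector` and `example_cost`, the nested-list picture layout, and the sample data are assumptions of this sketch. The cost function is passed in as a parameter; the example cost used here is the sum of absolute difference plus the spread between the largest and smallest absolute difference, which corresponds to the simplified form given later in this description with K×f(Qp) taken as 1.

```python
def abs_diffs(block, target, i0, j0, x, y):
    """Absolute pixel differences |C - P| for candidate (x, y), in scan order."""
    return [
        abs(c - target[i0 + di + x][j0 + dj + y])
        for di, row in enumerate(block)
        for dj, c in enumerate(row)
    ]


def select_motion_vector(block, target, i0, j0, candidates, cost):
    best_mv, best_cost = None, None
    for x, y in candidates:                              # Step 110
        value = cost(abs_diffs(block, target, i0, j0, x, y))   # Step 120
        if best_cost is None or value < best_cost:       # Step 130 (minimize)
            best_mv, best_cost = (x, y), value
    return best_mv, best_cost


def example_cost(diffs):
    # Assumed example: distortion term plus spread term with K*f(Qp) = 1.
    return sum(diffs) + (max(diffs) - min(diffs))


# Hypothetical data: 3x4 target picture, 2x2 encoding block at (0, 0).
target = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
block = [[6, 7], [10, 12]]
candidates = [(0, 0), (0, 1), (1, 0), (1, 1)]

print(select_motion_vector(block, target, 0, 0, candidates, example_cost))
```

Passing the cost as a callable keeps the search loop unchanged whichever distortion and entropy functions are chosen; with `cost=sum` the loop reduces to the prior-art search based on the sum of absolute difference.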

The present invention is different from the prior art because the cost function used in the present invention considers not only a sum of absolute difference calculated from a distortion function (i.e. the term |C_(i,j)−P_(i+x,j+y)|) but also another term corresponding to the quantization step Qp and the distribution or variation of the absolute difference (i.e. the term h(Qp, C_(i,j), P_(i+x,j+y))).

For instance, function h(Qp, C_(i,j), P_(i+x,j+y)) can be represented as K×f(Qp)×g(|C_(i,j)−P_(i+x,j+y)|), wherein K is a constant and function f(Qp) is a monotonically increasing function, meaning that the larger Qp is, the larger f(Qp) is. In this case, the larger Qp is, the more important the influence of h(Qp, C_(i,j), P_(i+x,j+y)) on the cost function is. In other words, f(Qp) can be regarded as an amplifying factor for the effect of g(|C_(i,j)−P_(i+x,j+y)|). Therefore, g(|C_(i,j)−P_(i+x,j+y)|) can still be used in the present invention even if f(Qp) is removed. Function g(|C_(i,j)−P_(i+x,j+y)|) roughly calculates or determines the variation of the pixel difference between the current encoding block and a block corresponding to the current candidate motion vector (x, y) (i.e. it represents the distribution or variation of the residual in the frequency domain). Generally, the smaller $\sum\limits_{i = i0}^{i1}{\sum\limits_{j = j0}^{j1}{g\left( \left| C_{i,j} - P_{i + x,j + y} \right| \right)}}$ is, the better the residual resulting from the candidate motion vector (x, y) is compressed after being processed by DCT, quantization, zig-zag scan (or other scan methods), and variable length coding.

Here is an example of function g(|C_(i,j)−P_(i+x,j+y)|): $g\left( \left| C_{i,j} - P_{i + x,j + y} \right| \right) = \left\{ \begin{matrix} {\left| C_{i,j} - P_{i + x,j + y} \right| - {ad\_ max},} & {{if}\quad \left| C_{i,j} - P_{i + x,j + y} \right| > {ad\_ max},\ (i,j) \neq (0,0)} \\ {{ad\_ min} - \left| C_{i,j} - P_{i + x,j + y} \right|,} & {{if}\quad \left| C_{i,j} - P_{i + x,j + y} \right| < {ad\_ min},\ (i,j) \neq (0,0)} \\ {0,} & {else} \end{matrix} \right.$

wherein ad_max and ad_min are given as follows: ${ad\_ max} = \left\{ \begin{matrix} {\left| C_{0,0} - P_{x,y} \right|,} & {{if}\quad (i,j) = (0,0)} \\ {\max\left\lbrack \left| C_{i,j} - P_{i + x,j + y} \right|,{ad\_ max} \right\rbrack,} & {{if}\quad (i,j) \neq (0,0)} \end{matrix} \right.$ and ${ad\_ min} = \left\{ \begin{matrix} {\left| C_{0,0} - P_{x,y} \right|,} & {{if}\quad (i,j) = (0,0)} \\ {\min\left\lbrack \left| C_{i,j} - P_{i + x,j + y} \right|,{ad\_ min} \right\rbrack,} & {{if}\quad (i,j) \neq (0,0)} \end{matrix} \right.$

In this example, whenever the function g(|C_(i,j)−P_(i+x,j+y)|) corresponding to a given (i, j) has been calculated, ad_max and ad_min are refreshed for the later calculation of the functions g(|C_(i,j)−P_(i+x,j+y)|) corresponding to the subsequent (i, j).
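The example function g and the refreshing of ad_max and ad_min may be sketched in Python as follows, as a non-limiting illustration. The function name `g_terms`, the flat list of absolute differences taken in processing order, and the sample values are assumptions of this sketch. Each value of g is computed against ad_max and ad_min as they stood before the current pixel, and the two bounds are refreshed afterwards.

```python
def g_terms(abs_diffs):
    """Return the value of g for every absolute difference, in processing order."""
    terms = []
    ad_max = ad_min = None
    for k, d in enumerate(abs_diffs):
        if k == 0:
            # First pixel, (i, j) = (0, 0): g is 0 and both bounds start at d.
            g = 0
            ad_max = ad_min = d
        elif d > ad_max:
            g = d - ad_max      # exceeds the running maximum
            ad_max = d          # refresh ad_max
        elif d < ad_min:
            g = ad_min - d      # falls below the running minimum
            ad_min = d          # refresh ad_min
        else:
            g = 0               # stays within the current range
        terms.append(g)
    return terms


# Hypothetical absolute differences for a four-pixel block.
print(g_terms([3, 7, 2, 5]))
```

For these made-up values, g is non-zero only for the pixels that widen the range, and the values of g add up to the final spread 7 − 2 = 5.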

Take the case of a 2*3 block, i.e. (i1-i0)=2 and (j1-j0)=3. Suppose there are only two candidate motion vectors in the search range, corresponding respectively to a first block and a second block; the absolute values of the pixel differences between the current encoding block and the first block are {8, 9, 6, 8, 7, 6}, and the absolute values of the pixel differences between the current encoding block and the second block are {5, 10, 4, 22, 0, 0}. Calculating the cost function by the sum of absolute difference according to the prior art, the cost functions of the first block and the second block are as follows:

first block: 8+9+6+8+7+6=44

second block: 5+10+4+22+0+0=41

Therefore the second block is taken as the best matching block for the current encoding block according to the prior art.

However, according to the present invention, where one assumes K=1 and f(Qp)=1, the cost functions of the first block and the second block are as follows:

first block: [8+0]+[9+(9−8)]+[6+(8−6)]+[8+0]+[7+0]+[6+0]=47

second block: [5+0]+[10+(10−5)]+[4+(5−4)]+[22+(22−10)]+[0+(4−0)]+[0+0]=63

Obviously, according to the present invention, the first block is taken as the best matching block for the current encoding block. This is very different from the prior art.
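The arithmetic of this example can be reproduced with the following Python sketch, given as a non-limiting illustration. The function name `cost` and its parameter names are assumptions of this sketch; K and f(Qp) are both taken as 1, as in the example above.

```python
def cost(abs_diffs, k=1, f_qp=1):
    """Sum over the block of |C - P| + K * f(Qp) * g(|C - P|)."""
    total = 0
    ad_max = ad_min = abs_diffs[0]   # bounds start at the first pixel, where g = 0
    for d in abs_diffs:
        if d > ad_max:
            g, ad_max = d - ad_max, d
        elif d < ad_min:
            g, ad_min = ad_min - d, d
        else:
            g = 0
        total += d + k * f_qp * g
    return total


first_block = [8, 9, 6, 8, 7, 6]
second_block = [5, 10, 4, 22, 0, 0]

# Prior art: sum of absolute difference only.
print(sum(first_block), sum(second_block))
# Cost function of the example, with K = 1 and f(Qp) = 1.
print(cost(first_block), cost(second_block))
```

The first line of output reproduces the values 44 and 41, under which the second block is preferred, and the second line reproduces 47 and 63, under which the first block is preferred.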

According to function g(|C_(i,j)−P_(i+x,j+y)|) described above, the cost function disclosed in the present invention can be simplified as follows: ${CF}(x,y) = \sum\limits_{i = i0}^{i1}{\sum\limits_{j = j0}^{j1}{\left| C_{i,j} - P_{i + x,j + y} \right|}} + K \cdot f(Qp) \cdot \left( {ad\_ max} - {ad\_ min} \right)$

wherein ad_max and ad_min are respectively the maximum and the minimum of |C_(i,j)−P_(i+x,j+y)| between (i, j)=(i0, j0) and (i, j)=(i1, j1).
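The simplification can be checked as follows. The notation in this paragraph is introduced only for illustration and is not part of the original disclosure: let d_1, ..., d_n denote the values |C_(i,j)−P_(i+x,j+y)| of the block in the order in which they are processed, let g_k denote the value of g for d_k, and let M_k and m_k denote ad_max and ad_min after d_k has been processed, so that M_1 = m_1 = d_1 and g_1 = 0. Then

$g_{k} = \left( M_{k} - M_{k - 1} \right) + \left( m_{k - 1} - m_{k} \right),\quad k = 2,\ldots,n$

$\sum\limits_{k = 1}^{n}g_{k} = \left( M_{n} - M_{1} \right) + \left( m_{1} - m_{n} \right) = M_{n} - m_{n} = {ad\_ max} - {ad\_ min}$

The first relation holds because, for a given k, at most one of the two brackets is non-zero (a value cannot exceed the running maximum and fall below the running minimum at once), and each bracket equals the corresponding case of g. The second relation follows because the sum telescopes. Multiplying by K×f(Qp), which does not depend on (i, j), gives the second term of the simplified cost function.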

The above-mentioned method can be summarized as follows:

First, utilize a distortion function such as |C_(i,j)−P_(i+x,j+y)| to calculate the difference between the pixels of the encoding block and the pixels in the search range of a target picture to generate at least one distortion value.

Second, utilize an entropy function such as h(Qp, C_(i,j), P_(i+x,j+y)) or g(|C_(i,j)−P_(i+x,j+y)|) to calculate the distribution or variation of the difference between the pixels of the encoding block and the pixels of blocks corresponding to candidate motion vectors to generate at least one distribution value.

Third, sum up the distortion values and the distribution values to generate at least one sum value. For example, utilize a cost function such as ${CF}(x,y) = \sum\limits_{i = i0}^{i1}{\sum\limits_{j = j0}^{j1}\left\{ {\left| C_{i,j} - P_{i + x,j + y} \right| + h\left( Qp,C_{i,j},P_{i + x,j + y} \right)} \right\}}$, or ${CF}(x,y) = \sum\limits_{i = i0}^{i1}{\sum\limits_{j = j0}^{j1}\left\{ {\left| C_{i,j} - P_{i + x,j + y} \right| + K \cdot g\left( \left| C_{i,j} - P_{i + x,j + y} \right| \right)} \right\}}$ to calculate the sum value.

Fourth, select the target motion vector according to the sum value.

That is, according to the present invention, the larger the distribution range or entropy of the pixel difference between the current encoding block and a block corresponding to a candidate motion vector (x, y) in the target picture is (in the example above, the larger the difference between the maximum and the minimum), the larger the cost function is, so that the block is less likely to be selected as the best matching block for the current encoding block.

Please refer to FIG. 2 showing a motion estimation device 200 in the video compression system according to the present invention. The video compression system divides a current picture into a plurality of blocks. For a current encoding block in the current picture, the device 200 can determine an optimal motion vector (x1, y1) from a plurality of candidate motion vectors (x, y) in a search range, wherein the current encoding block includes (i1-i0)*(j1-j0) pixels. As shown in FIG. 2, the device 200 includes: a distortion calculator 220 for calculating distortion between the pixels of the encoding block and the pixels in the search range of a target picture, where, for example, the distortion can be obtained from |C_(i,j)−P_(i+x,j+y)|, wherein C_(i,j) is a pixel in the current encoding block and P_(i+x,j+y) is a pixel in the search range of the target picture; a spatial variation calculating module 240 for calculating a distribution value by utilizing an entropy function such as h(Qp, C_(i,j), P_(i+x,j+y)), wherein Qp is a quantization step; and a motion vector determining module 260 coupled with the distortion calculator 220 and the spatial variation calculating module 240 for summing up the distortion values and the distribution values to generate at least one sum value and for determining the target motion vector according to the sum value. For example, the sum value can be calculated with a cost function as follows: $\sum\limits_{i = i0}^{i1}{\sum\limits_{j = j0}^{j1}\left\{ {\left| C_{i,j} - P_{i + x,j + y} \right| + h\left( Qp,C_{i,j},P_{i + x,j + y} \right)} \right\}}$

For instance, similar to the above description, function h(Qp, C_(i,j), P_(i+x,j+y)) can be represented as K×f(Qp)×g(|C_(i,j)−P_(i+x,j+y)|); therefore in device 200, the spatial variation calculating module 240 includes a first calculator 242 for calculating function g(|C_(i,j)−P_(i+x,j+y)|), a second calculator 244 for calculating function f(Qp), and a multiplier 246 coupled with the first calculator 242 and the second calculator 244 for calculating K×f(Qp)×g(|C_(i,j)−P_(i+x,j+y)|). As described above, f(Qp) can be regarded as an amplifying factor for the effect of g(|C_(i,j)−P_(i+x,j+y)|). Therefore, g(|C_(i,j)−P_(i+x,j+y)|) can still be used in the present invention even if f(Qp) is removed.

The motion vector determining module 260 includes an adder 262 coupled with the distortion calculator 220 and the spatial variation calculating module 240 for adding one of the distortion values and one of the distribution values (for example, |C_(i,j)−P_(i+x,j+y)|+h(Qp, C_(i,j), P_(i+x,j+y))), an accumulator 264 coupled with the adder 262 for calculating the sum of the distortion values and distribution values (for example, $\sum\limits_{i = i0}^{i1}{\sum\limits_{j = j0}^{j1}\left\{ {\left| C_{i,j} - P_{i + x,j + y} \right| + h\left( Qp,C_{i,j},P_{i + x,j + y} \right)} \right\}}$), and a determiner 270 coupled with the accumulator 264 for determining the candidate motion vector (x, y) which meets a desired condition, such as minimizing the cost function, to be the target motion vector (x1, y1) of the current encoding block. The target motion vector which minimizes the cost function may be regarded as an optimal motion vector.

For storing the smaller value of the sum values and comparing different sum values generated from different candidate motion vectors, as shown in FIG. 2, the determiner 270 includes a comparator 272 coupled with the accumulator 264 for outputting the target motion vector (x1, y1), and a storage 274 coupled with the comparator 272 for storing the smaller value of the sum values.
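The data flow through the device 200 may be modelled in software, as a non-limiting illustration, by the following Python sketch. The class and function names, the choice f(Qp) = Qp as one possible monotonically increasing function, the dictionary of precomputed absolute differences standing in for the output of the distortion calculator 220, and the motion vector labels are all assumptions of this sketch; the reference numerals in the comments indicate which element of FIG. 2 each part is meant to mirror.

```python
class SpatialVariationModule:
    """Models module 240: K x f(Qp) x g(|C - P|)."""

    def __init__(self, k=1, f=lambda qp: qp):
        self.k = k            # constant K
        self.f = f            # assumed f(Qp) = Qp (monotonically increasing)
        self.reset()

    def reset(self):
        self.ad_max = self.ad_min = None

    def value(self, qp, abs_diff):
        # First calculator 242: g, using the running ad_max and ad_min.
        if self.ad_max is None:
            g, self.ad_max, self.ad_min = 0, abs_diff, abs_diff
        elif abs_diff > self.ad_max:
            g, self.ad_max = abs_diff - self.ad_max, abs_diff
        elif abs_diff < self.ad_min:
            g, self.ad_min = self.ad_min - abs_diff, abs_diff
        else:
            g = 0
        # Second calculator 244 supplies f(Qp); multiplier 246 forms the product.
        return self.k * self.f(qp) * g


class Determiner:
    """Models determiner 270: comparator 272 plus storage 274."""

    def __init__(self):
        self.best_sum = None  # storage 274: smallest sum value so far
        self.best_mv = None

    def update(self, mv, sum_value):
        # Comparator 272: keep the candidate with the smaller sum value.
        if self.best_sum is None or sum_value < self.best_sum:
            self.best_sum, self.best_mv = sum_value, mv


def estimate(abs_diffs_by_mv, qp):
    variation = SpatialVariationModule()
    determiner = Determiner()
    for mv, abs_diffs in abs_diffs_by_mv.items():
        variation.reset()
        accumulated = 0                                  # accumulator 264
        for d in abs_diffs:                              # d: output of calculator 220
            accumulated += d + variation.value(qp, d)    # adder 262
        determiner.update(mv, accumulated)
    return determiner.best_mv, determiner.best_sum


# The two blocks of the worked example, with hypothetical motion vector labels.
candidates = {(0, 0): [8, 9, 6, 8, 7, 6], (0, 1): [5, 10, 4, 22, 0, 0]}
print(estimate(candidates, qp=1))
```

With qp=1 the assumed f(Qp) equals 1, so the sketch reproduces the sum value 47 of the first block in the worked example; a larger Qp only increases the weight of the spread term.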

In contrast to the prior art, the optimal motion vector in the search range of the current encoding block can be obtained according to a cost function different from that in the prior art. Since the cost function of the present invention is influenced by the variation of the pixel difference between the current encoding block and a block corresponding to the current candidate motion vector (x, y), the present invention can provide a better compression efficiency on the residual corresponding to the optimal motion vector after processing (by DCT, quantization, zig-zag scan and variable length coding).

Also, compared with the above-mentioned H.264, the present invention does not need to calculate the bit rate of each candidate motion vector and only needs a distortion function and an entropy function to select a target motion vector. Thus the calculation load can be decreased and the speed can be increased.

Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.

1. A motion estimation method for selecting a target motion vector from a plurality of candidate motion vectors in a search range of a target picture for an encoding block having a plurality of pixels, the method comprising: utilizing a distortion function to calculate the difference between the pixels of the encoding block and the pixels in the search range to generate at least one distortion value; utilizing an entropy function to determine the distribution of the difference between the pixels of the encoding block and the pixels in the search range corresponding to candidate motion vectors to generate at least one distribution value; summing up the distortion value and the distribution value to generate at least one sum value; and selecting the target motion vector according to the sum value.
2. The method of claim 1, wherein the distortion function is |C_(i,j)−P_(i+x,j+y)|, wherein C_(i,j) is the pixel in the encoding block, and P_(i+x,j+y) is the pixel in the search range of the target picture.
3. The method of claim 1, wherein the entropy function is g(|C_(i,j)−P_(i+x,j+y)|), wherein C_(i,j) is the pixel in the encoding block, and P_(i+x,j+y) is the pixel in the search range of the target picture.
4. The method of claim 1, wherein the entropy function is h(Qp, C_(i,j), P_(i+x,j+y)), wherein C_(i,j) is the pixel in the encoding block, and P_(i+x,j+y) is the pixel in the search range of the target picture.
5. The method of claim 4, wherein h(Qp, C_(i,j), P_(i+x,j+y)) is K×f(Qp)×g(|C_(i,j)−P_(i+x,j+y)|), where K is a constant and the function f(Qp) is a monotonic increasing function.
6. The method of claim 4, wherein the larger Qp is, the larger the function h(Qp, C_(i,j), P_(i+x,j+y)) is.
7. A motion estimation device for selecting a target motion vector from a plurality of candidate motion vectors in a search range for an encoding block having a plurality of pixels, the device comprising: a distortion calculator for utilizing a distortion function to calculate the difference between the pixels of the encoding block and the pixels in the search range of a target picture to generate at least one distortion value; a spatial variation calculating module for utilizing an entropy function to calculate the distribution of the difference between the pixels of the encoding block and the pixels in the search range corresponding to candidate motion vectors to generate at least one distribution value; and a motion vector determining module, coupled with the distortion calculator and the spatial variation calculating module, for summing up the distortion value and the distribution value to generate at least one sum value and for determining the target motion vector according to the sum value.
8. The device of claim 7, wherein the distortion function is |C_(i,j)−P_(i+x,j+y)|, wherein C_(i,j) is the pixel in the encoding block, and P_(i+x,j+y) is the pixel in the search range of the target picture.
9. The device of claim 7, wherein the entropy function is g(|C_(i,j)−P_(i+x,j+y)|), wherein C_(i,j) is the pixel in the encoding block, and P_(i+x,j+y) is the pixel in the search range of the target picture.
10. The device of claim 7, wherein the entropy function is h(Qp, C_(i,j), P_(i+x,j+y)), wherein C_(i,j) is the pixel in the encoding block, and P_(i+x,j+y) is the pixel in the search range of the target picture.
11. The device of claim 10, wherein h(Qp, C_(i,j), P_(i+x,j+y)) can be represented as K×f(Qp)×g(|C_(i,j)−P_(i+x,j+y)|), K being a constant, and function f(Qp) being a monotonic increasing function.
12. The device of claim 11, wherein the spatial variation calculating module comprises: a first calculator for calculating function g(|C_(i,j)−P_(i+x,j+y)|); a second calculator for calculating function f(Qp); and a multiplier coupled with the first calculator and the second calculator for calculating K×f(Qp)×g(|C_(i,j)−P_(i+x,j+y)|).
13. The device of claim 10, wherein the larger Qp is, the larger the function h(Qp, C_(i,j), P_(i+x,j+y)) is.
14. The device of claim 7, wherein the motion vector determining module comprises: an adder, coupled with the distortion calculator and the spatial variation calculating module, for adding the distortion value to the distribution value; an accumulator, coupled with the adder, for calculating the sum value; and a determiner, coupled with the accumulator, for determining the target motion vector.
15. The device of claim 14, wherein the determiner comprises: a comparator, coupled with the accumulator, for outputting the target motion vector; and a storage, coupled with the comparator, for storing the minimum value of the sum values that ever appears.